On the Role of Global Warming on the Statistics of Record-Breaking Temperatures 
We theoretically study the statistics of record-breaking daily temperatures and validate these
predictions using both Monte Carlo simulations and 126 years of available data from the city of 
Philadelphia. Using extreme statistics, we derive the number and the magnitude of record temper- 
ature events, based on the observed Gaussian daily temperature distribution in Philadelphia, as a 
function of the number of years of observation. We then consider the case of global warming, where 
the mean temperature systematically increases with time. Over the 126-year time range of observa- 
tions, we argue that the current warming rate is insufficient to measurably influence the frequency 
of record temperature events, a conclusion that is supported by numerical simulations and by the 
Philadelphia data. We also study the role of correlations between temperatures on successive days 
and find that they do not affect the frequency or magnitude of record temperature events. 

PACS numbers: 92.60.Ry, 92.60.Wc, 92.70.-j, 02.50.Cw 



I. INTRODUCTION 



Almost every summer, there is a heat wave somewhere 
in the US that garners popular media attention [1]. During
such hot spells, daily record high temperatures for
various cities are routinely reported in local news reports. 
A natural question arises: is global warming the cause 
of such heat waves or are they merely statistical fluctu- 
ations? Intuitively, record-breaking temperature events 
should become less frequent with time if the average tem- 
perature is stationary. Thus it is natural to be concerned 
that global warming is playing a role when there is a pro- 
liferation of record-breaking temperature events. In this 
work, we investigate how systematic climatic changes, 
such as global warming, affect the magnitude and fre- 
quency of record-breaking temperatures. We then as- 
sess the potential role of global warming by comparing 
our predictions both to record temperature data and to 
Monte Carlo simulation results. 

It bears emphasizing that record-breaking tempera- 
tures are distinct from threshold events, defined as ob- 
servations that fall outside a specified threshold of the 
climatological temperature distribution [?]. Thus, for example,
if a city's record temperature for a particular day
is 40°C, then an increase in the frequency of daily
temperatures above 36°C (i.e., above the 90th percentile)
is a threshold event, but not a record-breaking event.
Trends in threshold temperature events are also impacted
by climate change, and are thus an active research area
[?]. Studying threshold events is also one
of the ways to assess agricultural, ecological, and human
health effects due to climate change [8, 9].

Here we examine the complementary issue of record- 
breaking temperatures, in part because they are popular- 
ized by the media during heat waves and they influence 
public perception of climate change, and in part because 
of the fundamental issues associated with record statis- 
tics. We focus on daily temperature extremes in the city 
of Philadelphia, for which data are readily available on 
the Internet for the period 1874-1999 [2]. In particular,
we study how temperature records evolve in time
for each fixed day of the year. That is, if a record tem- 
perature occurs on January 1, 1875, how long until the 
next record on January 1 occurs? Using the fact that 
the daily temperature distribution is well approximated 
by a Gaussian (Sec. II B), we will apply basic ideas from
extreme value statistics in Sec. III to predict the magnitude
of the temperature jump when a new record is set,
as well as the time between successive records on a given 
day. These predictions are derived for an arbitrary daily 
temperature distribution, and then we work out specific 
results for the idealized case of an exponential daily tem- 
perature distribution and for the more realistic Gaussian 
distribution. 

Although individual record temperature events are 
fluctuating quantities, the average size of the tempera- 
ture jumps between successive records and the frequency 
of these records are systematic functions of time (see, e.g., 
[?] for a general discussion). This systematic behavior
permits us to make meaningful comparisons between our
theoretical predictions, numerical simulations (Sec. IV),
and the data for record temperature events in Philadelphia
(Sec. V). Clearly, it would be desirable to study
long-term temperature data from many locations to dis- 
criminate between the expected number of record events 
for a stationary climate and for global warming. For U.S. 
cities, however, daily temperature records extend back 
only 100-140 years [12, 13], and there are both gaps in
the data and questions about systematic effects caused
by "heat islands" for observation points in urban areas. 
In spite of these practical limitations, the Philadelphia 
data provide a useful testing ground for our theoretical 
predictions. 

In Sec. VI, we investigate the effect of a slow linear
global warming trend [?] on the statistics of record-high
and record-low temperature events. We argue that
the presently-available 126 years of data in Philadelphia, 
coupled with the current global warming rate, are in- 
sufficient to meaningfully alter the frequency of record 
temperature events compared to predictions based on a 
stationary temperature. This conclusion is our main re- 
sult. Finally, we study the role of correlations in the 
daily temperatures on the statistics of record temperature
events in Sec. VII. Although there are substantial
correlations between temperatures on nearby days and 
record temperature events tend to occur in streaks, these 
correlations do not affect the frequency of record tem- 
perature events for a given day. We summarize and offer 
some perspectives in Sec. VIII.



II. TEMPERATURE OBSERVATIONS 

The temperature data for Philadelphia were obtained 
from a website of the Earth and Mineral Sciences department
at Pennsylvania State University [10]. The data
contain both the low and high temperatures in Philadel- 
phia for each day between 1874 and 1999. The data are 
reported as an integer in degrees Fahrenheit, so we antic- 
ipate an error of ±1°F. No information is provided about 
the accuracy of the measurement or the precise location 
where the temperature is measured. Thus there is no 
provision for correcting for the heat island effect if the 
weather station is in an increasingly urbanized location 
during the observation period. For each day, we also doc- 
ument the middle temperature, defined as the average of 
the daily high and daily low. 
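
As a minimal sketch of this bookkeeping, the helper below converts integer Fahrenheit readings to Celsius and forms the middle temperature as the average of the daily high and low. The function names and the sample readings are illustrative assumptions and are not taken from the original data set.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def middle_temperature(high_f: float, low_f: float) -> float:
    """Middle temperature in Celsius: the average of the daily high and low."""
    return fahrenheit_to_celsius((high_f + low_f) / 2.0)


# Hypothetical reading: a high of 50 F and a low of 32 F on the same day.
print(round(middle_temperature(50, 32), 2))
# A 1 F reporting step corresponds to 5/9 of a degree Celsius.
print(round(fahrenheit_to_celsius(33) - fahrenheit_to_celsius(32), 3))
```

Because the readings are integers in Fahrenheit, the reporting granularity in Celsius is a little over half a degree.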

To get a feeling for the nature of the data, we first 
present basic observations about the average annual tem- 
perature and the variation of the temperature during a 
typical year. 



A. Annual averages and extremes 

Figure 1 shows the average annual high, middle, and
low temperature for each year between 1874 and 1999. 
To help discern systematic trends, we also plot 10-year 
averages for each data set. The average high tempera- 
ture for each year is increasing from 1874 until approxi- 
mately 1950 and again after 1965, but is decreasing from 
1950 to 1965. Over the 126 years of data, a linear fit 
to the time dependence of the annual high temperature 
for Philadelphia gives an increase of 1.62°C, compared to 
the well-documented global warming rate of 0.6 ± 0.2°C 
over the past century [9]. On the other hand, there does
not appear to be a systematic trend in the dependence
of the annual low temperature on the year. A linear fit
to these data gives a decrease of 0.38°C. This disparity
between high and low temperatures is a puzzling and as 
yet unexplained feature of the data. 
FIG. 1: (Color online) Average annual high, middle, and low
temperature (in degrees Celsius) for each year between 1874 
and 1999 (jagged dotted lines). Also shown are the corre- 
sponding 10-year averages (solid curves). 

A basic feature of the daily temperature is its approximately
sinusoidal annual variation (Fig. 2). The
coldest time of the year is early February while the 
warmest is late July. An amusing curiosity is the dis- 
cernible small peak during the period January 20-25. 
This anomaly is the traditional "January thaw" in the 
northeastern US where sometimes snowpack can melt 
and a spring-like aura occurs before winter returns (see 
[15] for a detailed discussion of this phenomenon).
FIG. 2: (Color online) Record high, average high, middle, and
low, and record low temperature (in degrees Celsius) for each 
day of the year. 

Also shown in Fig. 2 are the temperature extremes for
each day. The highest recorded temperature in Philadelphia
of 41.1°C (106°F) occurred on August 7, 1918, while
the lowest temperature of −23.9°C (−11°F) occurred on
February 9, 1934. Record temperatures also fluctuate 
more strongly than the mean temperature because there 
are only 126 years of temperature data. As a result of 
this short time span, some days of the year have experi- 
enced very few records and the resulting current extreme 
temperature can be far from the value that is expected 
on statistical grounds (see Sec. V).



B. Daily temperature distribution 

To understand the magnitude and frequency of daily 
record temperatures, we need the underlying tempera- 
ture distribution for each day of the year. Because tem- 
peratures have been recorded for only 126 years, the 
temperature distribution for each individual day is not 
smooth. To mitigate this problem, we aggregate the tem- 
peratures over a 9-day range and then use these aggre- 
gated data to define the temperature distribution for the 
middle day in this range. Thus, for example, for the 
temperature distribution on January 5, we aggregate all 
126 years of temperatures from January 1-9 (1134 data 
points). We also use the middle temperature for each day 
to define the temperature distribution. 
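
The aggregation step can be sketched as follows. The data layout (a dictionary mapping each year to a list of daily middle temperatures) and the function name are assumptions made for illustration, and the wrap-around at the ends of the year reuses the same calendar year, which is a simplification rather than the authors' stated procedure.

```python
def aggregate_nine_day(temps, center_day, half_width=4, days_in_year=365):
    """Pool temperatures from the days within half_width of center_day.

    temps maps a year to a list of daily temperatures indexed by
    day of year (0-based). Days near January 1 or December 31 wrap
    around within the same calendar year (a simplifying assumption).
    """
    pooled = []
    for year in sorted(temps):
        for offset in range(-half_width, half_width + 1):
            day = (center_day + offset) % days_in_year
            pooled.append(temps[year][day])
    return pooled


# Synthetic placeholder data: 126 years (1874-1999) of constant readings.
fake = {year: [0.0] * 365 for year in range(1874, 2000)}
sample = aggregate_nine_day(fake, center_day=4)  # January 5 has day index 4
print(len(sample))  # 126 years x 9 days
```

With 126 years of data, each pooled sample contains the 1134 points quoted above.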
FIG. 3: (Color online) Nine-day aggregated temperature distributions
for January 5, April 5, October 5, and July 5 in
degrees Celsius (top to bottom). Each data set is averaged 
over a 10% range — 10, 9, 8, and 6 points respectively, for Jan- 
uary 5, April 5, October 5, and July 5. The distributions are 
all shifted horizontally by the mean temperature for the day 
and then vertically to render all curves distinct. The dashed 
curves are visually-determined Gaussian fits. 

Figure 3 shows these aggregated temperature distributions
for four representative days: the 5th of January,
April, July, and October. Each distribution is shifted 
vertically to make them all non-overlapping. We also 
subtracted the mean temperature from each of the distributions,
so that they are all centered about zero. Visually,
we obtain good fits to these distributions with the
Gaussian $P(\Delta T) \propto e^{-(\Delta T)^2/2\sigma^2}$, where $\Delta T$ is the deviation
of the temperature from its mean value (in °C), and
with $\sigma \approx 5.07$, 4.32, 4.12, and 3.14 for January 5, April
5, October 5, and July 5, respectively. We therefore use a
Gaussian daily temperature distribution as the input to 
our investigation of the frequency of record temperatures 
in the next section. 

An important caveat needs to be made about the daily 
temperature distribution. Physically, this distribution 
cannot be Gaussian ad infinitum. Instead, the distri- 
bution must cut off more sharply at finite temperature 
values that reflect basic physical limitations (such as the 
boiling points of water and nitrogen). We will show in 
the next section that such a cutoff strongly influences 
the average waiting time between successive temperature 
records on a given day. 

Notice that the width of the daily temperature dis- 
tribution is largest in the winter and smallest in the 
summer. Another intriguing aspect of the daily distri- 
butions is the tail behavior. For January 5, there are 
deviations from a Gaussian at both the high- and low-temperature
extremes, while for April 5 and October 5,
there is an enhancement only on the high-temperature 
side. This enhancement is especially pronounced on April 
5, which corresponds to the season where record high 
temperatures are most likely to occur (see Sec. VII and
Fig. 13). What is not possible to determine with 126
years of data is whether the true temperature distribu- 
tion is Gaussian up to the cutoff points and the enhance- 
ment results from relatively few data, or whether the true 
temperature distribution on April 5 actually has a slower 
than Gaussian high-temperature decay. 



III. EVOLUTION OF RECORD 
TEMPERATURES 

We now determine theoretically the frequency and 
magnitude of record temperature events. The schematic 
evolution of these two characteristics is sketched in Fig. 4
for the case of record high temperatures. Each time a
record high for a fixed day of the year is set, we document
the year $t_i$ when the $i$th record occurred and the
corresponding record high temperature $T_i$. Under the
(unrealistic) assumptions that the temperatures for each
day are independent and identically distributed, we now calculate the
average values of $T_i$ and $t_i$ and their underlying
probability distributions. (For a general discussion of record
statistics for excursions past a fixed threshold, see, e.g.,
[?]; related work on the evolution of records is
given in Ref. [?].)

Suppose that the daily temperature distribution is
$p(T)$. Two subsidiary distributions needed for record
statistics are: (i) the probability that a randomly drawn
temperature exceeds $T$, $p_>(T)$, and (ii) the probability
that this randomly selected temperature is less than
$T$, $p_<(T)$. These distributions are [?]

$$p_<(T) = \int_0^{T} p(T')\,dT'; \qquad p_>(T) = \int_T^{\infty} p(T')\,dT'. \tag{1}$$

FIG. 4: Schematic evolution of the record high temperature
on a specified day for each passing year. Each dot represents
the daily high temperature for different years. The first
temperature is, by definition, the zeroth record temperature $T_0$.
This event occurs in year $t_0 = 0$. Successive record temperatures
$T_1, T_2, T_3, \ldots$ occur in years $t_1, t_2, t_3, \ldots$.



We now determine the $k$th record temperature $T_k$ recursively.
We use the terminology of record high temperatures,
but the same formalism applies for record lows.
Clearly $T_0$ coincides with the mean of the daily temperature
distribution, $T_0 = \int_0^\infty T\,p(T)\,dT$. The next record
temperature is the mean value of that portion of the temperature
distribution that lies beyond $T_0$; that is,

$$T_1 = \frac{\int_{T_0}^{\infty} T\,p(T)\,dT}{\int_{T_0}^{\infty} p(T)\,dT}. \tag{2}$$



This formula actually contains a sleight of hand. More
properly, we should average the above expression over the
probability distribution for $T_0$ to obtain the true average
value of $T_1$, rather than merely using the typical or the
average value of $T_0$ in the lower limit of the integral.
Equation (2) therefore does not give the true average
value of $T_1$, but rather gives what we term the typical
value of $T_1$. We will show how to compute the average
value shortly.

Proceeding recursively, the relation between successive
typical record temperatures is given by

$$T_{k+1} = \frac{\int_{T_k}^{\infty} T\,p(T)\,dT}{\int_{T_k}^{\infty} p(T)\,dT}, \tag{3}$$

where the above caveat about using the typical value of
$T_k$ in the lower limit, rather than the average over the
(as yet) unknown distribution of $T_k$, still applies.

We now compute $\mathcal{P}_k(T)$, the probability that the $k$th
record temperature equals $T$; this distribution is subject
to the initial condition $\mathcal{P}_0(T) = p(T)$. For the $k$th record
temperature, the following conditions must be satisfied
(refer to Fig. 4): (i) the previous record temperature
$T'$ must be less than $T$, (ii) the next $n$ temperatures,
with $n$ arbitrary, must all be less than $T'$, and (iii) the
last temperature must equal $T$. Writing the appropriate
probabilities for each of these events, we obtain

$$\mathcal{P}_k(T) = \left(\int_0^{T} \mathcal{P}_{k-1}(T') \sum_{n=0}^{\infty} [p_<(T')]^n\,dT'\right) p(T) = \left(\int_0^{T} \frac{\mathcal{P}_{k-1}(T')}{p_>(T')}\,dT'\right) p(T). \tag{4}$$



The above formula recursively gives the probability dis- 
tribution for each record temperature in terms of the dis- 
tribution for the previous record. 

Complementary to the magnitude of record tempera- 
tures, we determine the time between successive records. 
Suppose that the current record temperature equals $T_k$
and let $q_n(T_k)$ be the probability that a new record
high, the $(k+1)$st, is set $n$ years later. For this new
record, the first $n-1$ highs after the current record must
all be less than $T_k$, while the $n$th high temperature must
exceed $T_k$. Thus

$$q_n(T_k) = p_<(T_k)^{n-1}\,p_>(T_k). \tag{5}$$



The number of years between the $k$th record high $T_k$ and
the $(k+1)$st record $T_{k+1}$ is therefore

$$t_{k+1} - t_k = \sum_{n=1}^{\infty} n\,p_<^{\,n-1}\,p_> = \frac{1}{p_>(T_k)}. \tag{6}$$



We emphasize that this waiting time gives the time between
the $k$th record and the $(k+1)$st record when the
$k$th record temperature equals the specified value $T_k$. If
the typical value of $T_k$ is used in Eq. (6), we thus obtain
a quantity that we term the typical value of $t_k$.
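
The geometric sum behind Eq. (6) can be checked numerically. The sketch below, with an illustrative function name and arbitrarily chosen exceedance probabilities, accumulates $n\,p_<^{n-1}p_>$ term by term and compares the partial sum with $1/p_>$.

```python
def mean_waiting_time(p_exceed: float, n_max: int = 100_000) -> float:
    """Partial sum of n * p_<^(n-1) * p_> for n = 1..n_max."""
    p_below = 1.0 - p_exceed
    total = 0.0
    weight = p_exceed  # probability that the record is broken at n = 1
    for n in range(1, n_max + 1):
        total += n * weight
        weight *= p_below  # one more year spent below the current record
    return total


for p in (0.5, 0.1, 0.01):
    print(p, mean_waiting_time(p), 1.0 / p)
```

For a fixed record value the sum converges quickly; the divergence of the average waiting time discussed later arises only after averaging over the distribution of the record value itself.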

To obtain the true average waiting time, we first define
$Q_n(k)$ as the probability that the $k$th record is broken after
$n$ additional temperature observations, averaged over
the distribution for $T_k$. Using the definition of $q_n$, we
obtain the formal expression

$$Q_n(k) \equiv \int_0^{\infty} \mathcal{P}_k(T)\,q_n(T)\,dT = \int_0^{\infty} \mathcal{P}_k(T)\,p_<(T)^{n-1}\,p_>(T)\,dT. \tag{7}$$



Different approaches to determine the $Q_n$ are given in
Refs. [?].

There are a number of fundamental results available
about record statistics that are universal and do not depend
on the form of the initial daily temperature distribution,
as long as the daily temperatures are independent
and identically distributed (iid) continuous variables [?].
In a string of $n+1$ observations
(starting at time $n = 0$), there are $n!$ permutations of the
temperatures out of $(n+1)!$ total possibilities in which
the largest temperature is the last of the string. Thus
the probability that a new record occurs in the $n$th year
of observation, $R_n$, is simply [14, 19, 20, 21, 22]

$$R_n = \frac{1}{n+1}. \tag{8}$$

In a similar vein, the probability that the initial (0th)
record is broken at the $n$th observation, $Q_n(0)$, requires
that the last temperature is the largest while the 0th
temperature is the second largest out of $n+1$ independent
variables. The probability for this event is therefore

$$Q_n(0) = \frac{1}{n(n+1)}, \tag{9}$$

again independent of the form of the daily temperature
distribution. Thus the average waiting time between the
zeroth and first records, $\langle n \rangle = \sum_{n=1}^{\infty} n\,Q_n(0)$, is infinite!
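
The permutation argument can be verified by brute force for small $n$. The sketch below (the function name is an illustrative choice) enumerates every ordering of $n+1$ distinct temperatures and counts the orderings in which the last observation is a record, and those in which the zeroth record is first broken at observation $n$.

```python
from fractions import Fraction
from itertools import permutations


def record_probabilities(n: int):
    """Enumerate all orderings of n + 1 distinct temperatures.

    Returns (R_n, Q_n0): the probability that observation n is a record,
    and the probability that the zeroth record is first broken at
    observation n.
    """
    total = 0
    record_at_n = 0
    first_break_at_n = 0
    for ranks in permutations(range(n + 1)):
        total += 1
        if ranks[n] == n:  # the last observation is the largest of all
            record_at_n += 1
        # observations 1..n-1 stay below the zeroth one; observation n exceeds it
        if ranks[n] > ranks[0] and all(r < ranks[0] for r in ranks[1:n]):
            first_break_at_n += 1
    return Fraction(record_at_n, total), Fraction(first_break_at_n, total)


for n in range(1, 7):
    r_n, q_n0 = record_probabilities(n)
    print(n, r_n, q_n0)
```

Only the rank ordering enters, which is why the result does not depend on the form of the underlying continuous distribution.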

More generally, the distribution of times between successive
records can be obtained by simple reasoning
[20, 21]. Consider a string of iid random variables that
are labeled by the time index $n$, with $n = 0, 1, 2, \ldots, t$.
Define the indicator function

$$\sigma_n = \begin{cases} 1 & \text{if a record occurs in the } n\text{th year,} \\ 0 & \text{otherwise.} \end{cases} \tag{10}$$



By definition, the probability for a record to occur in the
$n$th year is $R_n = \langle \sigma_n \rangle = \frac{1}{n+1}$. Therefore the average
number of records that have occurred up to time $t$ is

$$\langle R_n \rangle = \sum_{n=1}^{t} \langle \sigma_n \rangle \sim \ln t. \tag{11}$$



Moreover, because the order of all non-record events is
immaterial in the probability for a record event, there
are no correlations between the times of two successive
record events. That is, $\langle \sigma_m \sigma_n \rangle = \langle \sigma_m \rangle \langle \sigma_n \rangle$. Thus the
probability distribution of records is described by a Poisson
process in which the mean number of records up to
time $t$ is $\ln t$. Consequently, the probability $\Pi(n,t)$ that
$n$ records have occurred up to time $t$ is given by [20]

$$\Pi(n,t) \sim \frac{(\ln t)^n}{n!}\,e^{-\ln t} = \frac{(\ln t)^n}{n!\,t}. \tag{12}$$
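
A short numerical sketch of these two statements follows; the function names and the sample values of $t$ are illustrative assumptions. It sums $R_n = 1/(n+1)$ directly to show the logarithmic growth (for finite $t$ the sum differs from $\ln t$ by an offset that stays bounded), and it evaluates the Poisson form with mean $\ln t$.

```python
import math


def mean_records(t: int) -> float:
    """Sum of R_n = 1/(n+1) over years n = 1..t (records after the zeroth)."""
    return sum(1.0 / (n + 1) for n in range(1, t + 1))


def poisson_record_probability(n: int, t: float) -> float:
    """Poisson form (ln t)^n / (n! t) for the probability of n records by time t."""
    return math.log(t) ** n / (math.factorial(n) * t)


for t in (10, 126, 10_000):
    print(t, round(mean_records(t), 3), round(math.log(t), 3))

# The Poisson probabilities for t = 126 sum to one over all n.
print(sum(poisson_record_probability(n, 126.0) for n in range(60)))
```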



To appreciate the implications of these formulae for 
record statistics, we first consider the warm-up exercise 
of an exponential daily temperature distribution. For 
this case, all calculations can be performed explicitly and 
the results provide intuition into the nature of record 
temperature statistics. We then turn to the more realistic 
case of the Gaussian temperature distribution. 



A. Exponential distribution 

Suppose that the temperature distribution for each day
of the year is $p(T) = \mathcal{T}^{-1} e^{-T/\mathcal{T}}$. Equation (1) then gives

$$p_<(T) = 1 - e^{-T/\mathcal{T}}; \qquad p_>(T) = e^{-T/\mathcal{T}}. \tag{13}$$



We now determine the typical value of each $T_k$. The
zeroth record temperature is $T_0 = \int_0^\infty T\,p(T)\,dT = \mathcal{T}$.
Performing the integrals in Eq. (3) successively for each
$k$ gives the basic result

$$T_k = (k+1)\,\mathcal{T}, \tag{14}$$

namely, a constant jump between typical values of successive
record temperatures.
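
One way to see the constant jump, sketched here for a single step of Eq. (3) with the exponential distribution, is to evaluate the numerator and denominator directly:

$$\int_{T_k}^{\infty} \frac{T}{\mathcal{T}}\,e^{-T/\mathcal{T}}\,dT = (T_k + \mathcal{T})\,e^{-T_k/\mathcal{T}}, \qquad \int_{T_k}^{\infty} \frac{1}{\mathcal{T}}\,e^{-T/\mathcal{T}}\,dT = e^{-T_k/\mathcal{T}}, \qquad\text{so that}\qquad T_{k+1} = T_k + \mathcal{T}.$$

Starting from $T_0 = \mathcal{T}$, each record therefore adds one unit of $\mathcal{T}$, a reflection of the memoryless property of the exponential distribution.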

For the probability distribution for each record temperature,
we compute $\mathcal{P}_k(T)$ one at a time for $k = 0, 1, 2, \ldots$
using Eq. (4). This gives the gamma distribution [23]

$$\mathcal{P}_k(T) = \frac{1}{k!}\,\frac{T^k}{\mathcal{T}^{k+1}}\,e^{-T/\mathcal{T}}. \tag{15}$$

This distribution reproduces the typical values of successive
temperature records given by Eq. (14); thus the
typical and true average values for each record temperature
happen to be identical for an exponential temperature
distribution. The standard deviation of $\mathcal{P}_k(T)$ is
given by $\sqrt{\langle T^2 \rangle - \langle T \rangle^2} = \mathcal{T}\sqrt{k+1}$, so that successive
record temperatures become less sharply localized as $k$
increases.
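
As a consistency check, offered as a sketch of one inductive step rather than the authors' own derivation, the gamma form can be substituted into the recursion of Eq. (4). Using $\mathcal{P}_{k-1}(T')/p_>(T') = T'^{\,k-1}/[(k-1)!\,\mathcal{T}^{k}]$ for the exponential distribution,

$$\mathcal{P}_k(T) = \left(\int_0^{T} \frac{T'^{\,k-1}}{(k-1)!\,\mathcal{T}^{k}}\,dT'\right) \frac{1}{\mathcal{T}}\,e^{-T/\mathcal{T}} = \frac{1}{k!}\,\frac{T^k}{\mathcal{T}^{k+1}}\,e^{-T/\mathcal{T}}.$$

The mean of this distribution is $(k+1)\mathcal{T}$ and its variance is $(k+1)\mathcal{T}^2$, in agreement with the values quoted above.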

For the typical time between the $k$th and $(k+1)$st
records, Eq. (6) gives

$$t_{k+1} - t_k = \frac{1}{p_>(T_k)} = e^{T_k/\mathcal{T}}. \tag{16}$$

Substituting $T_k = (k+1)\mathcal{T}$ into Eq. (16), the typical
time is $e^{T_k/\mathcal{T}} = e^{(k+1)}$. Thus records become less likely
as the years elapse. Notice that the time between records
does not depend on $\mathcal{T}$ because of a cancellation between
the size of the temperature "barrier" (the current record)
and the size of the jump to surmount the record.

For the distribution of waiting times between records,
we first consider the time between $T_0$ and $T_1$ in detail to
illustrate our approach. Substituting Eqs. (13) and (15)
into Eq. (7), this distribution is

$$Q_n(0) = \int_0^{\infty} \frac{1}{\mathcal{T}}\,e^{-T/\mathcal{T}}\,\bigl(1 - e^{-T/\mathcal{T}}\bigr)^{n-1}\,e^{-T/\mathcal{T}}\,dT. \tag{17}$$

Performing this integral by parts gives the result of
Eq. (9), $Q_n(0) = 1/[n(n+1)]$.
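
The integral can also be checked exactly without integrating by parts. In the sketch below (an illustrative alternative route, not the method used in the text), the substitution $u = e^{-T/\mathcal{T}}$ maps the integral onto $\int_0^1 u\,(1-u)^{n-1}\,du$, which is then expanded binomially and summed with exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def q_n0_exact(n: int) -> Fraction:
    """Integral of u (1 - u)^(n-1) over 0 <= u <= 1, expanded binomially.

    Each term (-1)^j C(n-1, j) u^(j+1) integrates to (-1)^j C(n-1, j) / (j + 2).
    """
    return sum(Fraction((-1) ** j * comb(n - 1, j), j + 2) for j in range(n))


for n in (1, 2, 5, 10):
    print(n, q_n0_exact(n), Fraction(1, n * (n + 1)))
```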

For later applications, however, we determine the
large-$n$ behavior of $Q_n(0)$ by an asymptotic analysis.
Defining $x = T/\mathcal{T}$, we rewrite Eq. (17) for large $n$ as

$$Q_n(0) = \int_0^{\infty} e^{-x}\,(1 - e^{-x})^{n-1}\,e^{-x}\,dx \approx \int_0^{\infty} e^{-2x}\,e^{-n e^{-x}}\,dx. \tag{18}$$

The double exponential in the integrand changes suddenly
from 0 to 1 when $n \approx e^{x}$, or $x \approx \ln n$. To estimate
$Q_n(0)$, we may omit the double exponential in the integrand
and simply replace the lower limit of the integral by
$\ln n$. This approach immediately leads to $Q_n(0) \sim n^{-2}$,
in agreement with the exact result.
In general, the waiting-time distribution between the $k$th
and $(k+1)$st record is, from Eq. (7),

$$Q_n(k) = \int_0^{\infty} \frac{1}{k!}\,\frac{T^k}{\mathcal{T}^{k+1}}\,e^{-T/\mathcal{T}}\,\bigl(1 - e^{-T/\mathcal{T}}\bigr)^{n-1}\,e^{-T/\mathcal{T}}\,dT. \tag{19}$$



While we can express this integral exactly in terms of
derivatives of the $\beta$ function [24], it is more useful to
determine its asymptotic behavior by the same analysis
as that given in Eq. (18). We thus rewrite $(1 - e^{-x})^{n-1}$ as
a double exponential and use the fact that this function
is sharply cut off for $x < \ln n$ to reduce the integral of
Eq. (19) to

$$Q_n(k) \approx \int_{\ln n}^{\infty} \frac{x^k}{k!}\,e^{-2x}\,dx. \tag{20}$$

To find the asymptotic behavior of this integral, we note
that the integrand has a maximum at $x^* = k/2$. Thus for
$\ln n > x^*$, the exponential decay term controls the integral
and we may again estimate its value by taking the integrand
at the lower limit to give $Q_n(k) \propto (\ln n)^k/n^2$. As
a result of the power-law tail, the average waiting time
between any two consecutive records is infinite.

However, the observationally meaningful quantity is
the typical value of the waiting time and we thus focus
on typical values to characterize the steps between successive
records depicted in Fig. 4. The typical time to
reach the $k$th record, $t_k$, is simply the sum of the typical
times between records. Thus

$$t_k = (t_k - t_{k-1}) + (t_{k-1} - t_{k-2}) + \ldots + (t_2 - t_1) + t_1 = e^{k} + e^{k-1} + \ldots + e^{2} + e^{1} \approx 1.58\,e^{k}. \tag{21}$$

Equivalently, $\ln t_k \approx k + 0.459$ so that Eq. (14) gives
$T_k \approx (\ln t_k + 0.541)\,\mathcal{T}$. Therefore the $k$th record high
temperature increases logarithmically with the total number
of observations, as expected from basic extreme statistics
considerations [18].
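
The numerical constants in this estimate follow from a geometric series, as the short sketch below illustrates (the function name is an illustrative choice). The ratio $t_k/e^{k}$ approaches $e/(e-1)$, whose logarithm gives the offset in $\ln t_k$.

```python
import math


def typical_time(k: int) -> float:
    """Typical time to reach the k-th record: e^1 + e^2 + ... + e^k."""
    return sum(math.exp(j) for j in range(1, k + 1))


for k in (1, 3, 6, 10):
    ratio = typical_time(k) / math.exp(k)
    print(k, round(ratio, 4), round(math.log(ratio), 4))

# Large-k limit of the ratio and of its logarithm.
limit = math.e / (math.e - 1.0)
print(round(limit, 4), round(math.log(limit), 4))
```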

After $k$ record temperatures for a given day have
been set, the probability for the next record to occur
is $p_>(T_k) = e^{-T_k/\mathcal{T}}$. Since $T_k \approx \mathcal{T} \ln t_k$, we recast this
probability as a function of time to obtain

$$p_>(t) = e^{-T_k/\mathcal{T}} \propto e^{-\ln t} = 1/t, \tag{22}$$

thus reproducing the general result in [14, 19, 20, 21, 22].
The annual number of record temperatures after $t$ years
should be $365/t$; for the Philadelphia data, this gives 2.90
record temperatures for the year 2000, 126 years after the
start of observations.
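
The arithmetic behind this estimate is simple enough to state as a one-line function; the name and the default of 365 days are illustrative assumptions.

```python
def expected_records_per_year(years_elapsed: int, days_per_year: int = 365) -> float:
    """Each calendar day sets a new record with probability ~1/t after t years."""
    return days_per_year / years_elapsed


print(round(expected_records_per_year(126), 2))
```

The same estimate applies separately to record highs and to record lows, since the $1/t$ law holds for either kind of record under the stationary, iid assumptions.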



B. Gaussian distribution 

We now study record temperature statistics for the 
more realistic case of a Gaussian daily temperature distribution.
Again, to avoid the divergence caused by the unphysical
infinite limits in the Gaussian, we begin by computing
the typical value $T_k$ of the $k$th record temperature,
and the typical time $t_k$ until this record. While the
calculational steps to obtain these quantities are identical
to those of the previous subsection, the details are more 
complicated because the integrals for p< and p> must be 
evaluated numerically or asymptotically. 

As will become evident, the mean value in the Gaussian
merely sets the value of $T_0$ and plays no further role
in successive record temperatures. Thus for the daily
temperature distribution, we use the canonical form

$$p(T) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-T^2/2\sigma^2} \tag{23}$$



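to determine the values of successive record temperatures.
The exceedance probability then is

$$p_>(T) = \int_T^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-x^2/2\sigma^2}\,dx = \frac{1}{2}\,\mathrm{erfc}\bigl(T/\sqrt{2\sigma^2}\bigr) \sim \frac{1}{\sqrt{2\pi}}\,\frac{\sigma}{T}\,e^{-T^2/2\sigma^2}, \qquad T \gg \sqrt{2\sigma^2}, \tag{24}$$

where $\mathrm{erfc}(z)$ is the complementary error function [24].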

Clearly, $T_0 = 0$, since the Gaussian distribution is symmetric.
If we had used a Gaussian with a non-zero mean
value, then all the $T_k$ would merely be shifted higher
by this mean value. For the next record temperature,
Eq. (2) gives

$$T_1 = \frac{\int_0^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,T\,e^{-T^2/2\sigma^2}\,dT}{\int_0^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-T^2/2\sigma^2}\,dT}. \tag{25}$$



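Substituting $u = T^2/2\sigma^2$ and $v = T/\sqrt{2\sigma^2}$ in the numerator
and denominator, respectively, we obtain

$$T_1 = \frac{\int_0^{\infty} \sqrt{\frac{\sigma^2}{2\pi}}\,e^{-u}\,du}{\frac{1}{2}\,\mathrm{erfc}(0)} = \sqrt{\frac{2\sigma^2}{\pi}}. \tag{26}$$

Continuing this recursive computation, Eq. (3) gives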



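$$T_{k+1} = \frac{\int_{T_k}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,T\,e^{-T^2/2\sigma^2}\,dT}{\int_{T_k}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-T^2/2\sigma^2}\,dT} = T_1\,\frac{e^{-T_k^2/2\sigma^2}}{\mathrm{erfc}\bigl(T_k/\sqrt{2\sigma^2}\bigr)}. \tag{27}$$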



For the first few $k$, it is necessary to evaluate the error
function numerically and we find $T_2 \approx 1.712\,T_1$, $T_3 \approx 2.288\,T_1$,
$T_4 \approx 2.782\,T_1$, etc. Now from Eq. (26), the
argument of the error function in Eq. (27) is $T_k/\sqrt{2\sigma^2} = T_k/(T_1\sqrt{\pi})$.
Thus for $k \geq 3$, this argument is greater
than 1, and it becomes increasingly accurate to use the
large-$z$ asymptotic form [24]

$$\mathrm{erfc}(z) \sim \frac{e^{-z^2}}{\sqrt{\pi}\,z}\left(1 - \frac{1}{2z^2} + \ldots\right).$$

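This approximation reduces the recursion for $T_{k+1}$ to

$$T_{k+1} = T_1\,\frac{e^{-T_k^2/2\sigma^2}}{\mathrm{erfc}\bigl(T_k/\sqrt{2\sigma^2}\bigr)} \approx \frac{T_k}{1 - \dfrac{1}{2\bigl(T_k/\sqrt{2\sigma^2}\bigr)^2}} \approx T_k\left(1 + \frac{\sigma^2}{T_k^2}\right), \tag{28}$$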



where we have used $T_1 = \sqrt{2\sigma^2/\pi}$ from Eq. (26).
Writing the last line as $T_{k+1} - T_k = \sigma^2/T_k$, approximating
the difference by a derivative, and integrating,
the $k$th record temperature for large $k$ has the remarkably
simple form

$$T_k \sim \sqrt{2k\sigma^2}. \tag{29}$$



Thus successive record temperatures asymptotically be- 
come more closely spaced for the Gaussian distribution. 
It should be noted, however, that the largest number 
of record temperature events on any given day in the 
Philadelphia data is 10, so that the applicability of the 
asymptotic approximation is necessarily limited. 
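
How slowly the asymptotic form is approached can be seen by iterating the exact recursion and comparing with $\sqrt{2k\sigma^2}$. This is a sketch with illustrative names, $\sigma = 1$, and $k$ limited to 100 so that the exponential and `math.erfc` stay well within floating-point range.

```python
import math


def typical_records(k_max: int, sigma: float = 1.0):
    """Iterate T_{k+1} = T_1 exp(-T_k^2/2 sigma^2) / erfc(T_k/sqrt(2 sigma^2)) from T_0 = 0."""
    t_1 = math.sqrt(2.0 * sigma**2 / math.pi)
    values = [0.0]
    for _ in range(k_max):
        z = values[-1] / math.sqrt(2.0 * sigma**2)
        values.append(t_1 * math.exp(-z * z) / math.erfc(z))
    return values


records = typical_records(100)
for k in (4, 10, 50, 100):
    # ratio of the iterated T_k to the asymptotic estimate sqrt(2 k) sigma
    print(k, round(records[k] / math.sqrt(2.0 * k), 3))
```

For the handful of records available on any single day, the ratio is still noticeably below one, consistent with the caution expressed above.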

The more fundamental measure of the temperature
jumps is again $\mathcal{P}_k(T)$, the probability distribution that
the $k$th record high equals $T$. For a Gaussian daily
temperature distribution, the general recursion given in
Eq. (4) for $\mathcal{P}_k(T)$ is no longer exactly soluble, but we can
give an approximate solution that we expect will become
more accurate as $k$ is increased. We merely employ the
large-$T$ asymptotic form for $p_>(T)$ in the recursion for
$\mathcal{P}_k(T)$ even when $k$ is small so that $T$ is not necessarily
much larger than $\sigma$. Using this approach, we thus obtain
for $\mathcal{P}_1(T)$

$$\mathcal{P}_1(T) \approx \left(\int_0^{T} \frac{T'}{\sigma^2}\,dT'\right) \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-T^2/2\sigma^2} = \frac{T^2}{2\sigma^2\sqrt{2\pi\sigma^2}}\,e^{-T^2/2\sigma^2}. \tag{30}$$



Continuing this straightforward recursive procedure then
gives

$$\mathcal{P}_k(T) \approx \frac{1}{2^k\,\Gamma(k+1)}\,\frac{T^{2k+1}}{\sigma^{2k+2}}\,e^{-T^2/2\sigma^2}, \tag{31}$$



where the amplitude is determined after the fact by de- 
manding that the distribution is normalized. 

In spite of the crudeness of this approximation, this
distribution agrees reasonably with our numerical simulation
results shown in Fig. 5 (details of the simulation
are described in the following section). The distributions
$\mathcal{P}_k(T)$ move systematically to higher temperatures and
become progressively narrower as $k$ increases, in accordance
with naive intuition. The approximate form of
Eq. (31) gives a similar shape to the simulated distributions,
but there is an overall shift to higher temperatures
by roughly 1-2°C.

FIG. 5: Simulation data for the probability distribution of the
$k$th record high temperature in degrees Celsius, $\mathcal{P}_k(T)$. The
distribution $\mathcal{P}_0(T)$ coincides with the Gaussian of Eq. (23),
whose parameters match the average temperature and dispersion
in Philadelphia. The solid curves correspond to a
stationary temperature, while the dashed curves correspond
to global warming with rate $v = 0.012$°C year$^{-1}$ (see Sec. VI).

Next, we study the typical time between successive
record temperatures. Equation (6) states that $t_{k+1} - t_k = 1/p_>(T_k)$.
Using the above asymptotic expansion of the
complementary error function in the integral for $p_>$ and
$T_k \sim \sqrt{2k\sigma^2}$ from Eq. (29), we obtain, for large $k$,

$$t_{k+1} - t_k \approx \sqrt{2\pi}\,\frac{T_k}{\sigma}\,e^{T_k^2/2\sigma^2} \approx \sqrt{4\pi k}\;e^{k}. \tag{32}$$

Again, the times between records are independent of $\sigma$;
this independence arises because both the size of the
record and the magnitude of the jumps to surpass the
record are proportional to $\sigma$, so that its value cancels out
in the waiting times.

Finally, we compute the asymptotic behavior for the
distribution of waiting times between records. For simplicity,
we consider only the waiting time distribution
$Q_n(0)$ until the first record. The distribution of waiting
times for subsequent records has the same asymptotic
tail as $Q_n(0)$, but also contains more complicated
pre-asymptotic factors. Substituting the Gaussian for $p(T)$
and the asymptotic form for $p_>(T)$ into Eq. (7), and then
expanding $(1 - p_>)^{n-1}$ as a double exponential, we obtain

$$Q_n(0) \approx \int_0^{\infty} \frac{1}{2\pi x}\,\exp\left[-n\sqrt{\frac{\sigma^2}{2\pi x^2}}\;e^{-x^2/2\sigma^2}\right] e^{-x^2/\sigma^2}\,dx. \tag{33}$$



The double exponential again cuts off th e integra l when 
x is less than a threshold value x* ~ V2cr 2 Inn. As a 
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result, Eq. (|33j) reduces to 

Qn(0) ~ / — e-" 2 ^ 2 dx ~ -j . (34) 

In the final result, we drop logarithmic corrections be- 
cause the approximation made in writing Eq. (|3"3"j) also 
contains errors of the same magnitude. Thus the dis- 
tribution of waiting times n until the first record again 
has a n~ 2 power-law tail and the mean waiting time is 
infinite. 
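
The $n^{-2}$ tail can be checked numerically. For any continuous distribution of independent temperatures, the probability that the first record occurs exactly $n$ years after the initial observation is $1/[n(n+1)]$, which has this tail. The sketch below is a minimal Monte Carlo estimate with unit-variance Gaussian draws; the function names, trial count, and cutoff `max_wait` are illustrative choices, not values from the simulations reported here.

```python
import random

def first_record_wait(rng, max_wait):
    """Years until a Gaussian draw first exceeds the initial (zeroth) value.
    Returns None if no record occurs within max_wait years."""
    t0 = rng.gauss(0.0, 1.0)
    for n in range(1, max_wait + 1):
        if rng.gauss(0.0, 1.0) > t0:
            return n
    return None

def estimate_q(trials=20000, max_wait=50, seed=1):
    """Estimated Q_n(0) for n = 1..max_wait (index 0 is unused)."""
    rng = random.Random(seed)
    counts = [0] * (max_wait + 1)
    for _ in range(trials):
        n = first_record_wait(rng, max_wait)
        if n is not None:
            counts[n] += 1
    return [c / trials for c in counts]

if __name__ == "__main__":
    q = estimate_q()
    for n in (1, 2, 5, 10):
        print(n, q[n], 1.0 / (n * (n + 1)))
```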

The typical time until the $k$th record is again given by the sum of successive time intervals. Asymptotically, Eq. (32) gives

$$ t_k \sim \int_0^k \sqrt{4\pi m}\; e^{m}\, dm \sim \sqrt{4\pi k}\; e^{k}, \tag{35} $$

or $k \approx \ln t - \tfrac{1}{2}\ln(4\pi \ln t)$. Thus the number of records grows slowly with time; this result has the obvious consequence that records become less likely to occur at later times.
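
To see how accurate the inversion of Eq. (35) is, the following sketch solves $t = \sqrt{4\pi k}\,e^{k}$ for $k$ by bisection and compares the result with $\ln t - \tfrac{1}{2}\ln(4\pi \ln t)$. The helper names are hypothetical, and the bisection bracket assumes $t$ larger than about 10.

```python
import math

def records_exact(t):
    """Solve t = sqrt(4 pi k) e^k for k by bisection (assumes t > ~10)."""
    def f(k):
        return 0.5 * math.log(4.0 * math.pi * k) + k - math.log(t)
    lo, hi = 1.0, math.log(t) + 1.0   # f(lo) < 0 < f(hi) for t > ~10
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def records_approx(t):
    """Leading-order inversion k ~ ln t - (1/2) ln(4 pi ln t)."""
    return math.log(t) - 0.5 * math.log(4.0 * math.pi * math.log(t))

if __name__ == "__main__":
    for t in (1e2, 1e4, 1e6):
        print(t, round(records_exact(t), 3), round(records_approx(t), 3))
```

The two values differ by a slowly varying amount of order 0.1 at $t = 10^6$, consistent with dropping the subleading logarithmic correction.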

IV. MONTE CARLO SIMULATIONS 

To verify our theoretical derivations, Monte Carlo simulations were performed for both the exponential and Gaussian temperature distributions. Our simulations typically involve $10^5$ realizations (days) over a minimum of 1000 years of observations, and continue until six record temperatures have been achieved. We use "years" consisting of $10^5$ days so that we generate a sufficient number of record temperatures to have reasonable statistics. For our initial simulations, we used a stationary mean and variance of 18°C and 5°C respectively, which are typical values for the distribution of maximum daily temperatures in the spring or fall in Philadelphia. However, the numerical validation of our theoretical distributions does not depend on the particular values of mean and variance.

The simulation errors using an exponential distribution for the $k$th record (with $k = 0, 1, \ldots, 5$) are: less than $3 \times 10^{-5}$ for $\mathcal{P}_k(T)$ (Eq. (15)) using a distribution with 100 bins; $8.3 \times 10^{-5}$ for $Q_n(0)$ (Eq. ©); $2.2 \times 10^{-3}$ (relative error) for the mean temperature of the $k$th record temperature (Eq. (14)); and 0.01 (relative error) for the variance. The Gaussian distribution yields fewer exact expressions for comparison, but includes a relative error of $6.4 \times 10^{-3}$ for the mean temperature of the $k$th record temperature (Eq. (28)), $k = 0 \ldots 5$. For both the exponential and Gaussian distributions, the probability of breaking a record temperature with time is well fit by the form $1/(t+1)$, with an error of less than $9.2 \times 10^{-5}$. These errors decrease as the number of realizations increases, and the small errors for simulations with $10^5$ realizations confirm the correctness of the theoretical distributions.
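
A scaled-down version of such a simulation can be sketched as follows. This is not the code used for the results above: it uses far fewer realizations, the function name and defaults are hypothetical, and the quoted 5°C is interpreted as a standard deviation. It estimates the probability that year $t$ sets a record high and can be compared with $1/(t+1)$.

```python
import random

def record_frequency(years=40, realizations=10000, mean=18.0, sd=5.0, seed=0):
    """Fraction of realizations in which year t sets a new record high.
    Year 0 defines the zeroth record, so index 0 stays at zero."""
    rng = random.Random(seed)
    hits = [0] * years
    for _ in range(realizations):
        current = rng.gauss(mean, sd)
        for t in range(1, years):
            temp = rng.gauss(mean, sd)
            if temp > current:
                current = temp
                hits[t] += 1
    return [h / realizations for h in hits]

if __name__ == "__main__":
    freq = record_frequency()
    for t in (1, 4, 9, 19):
        print(t, freq[t], 1.0 / (t + 1))
```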

Monte Carlo simulations were also performed to explore the effect of temporal correlations in daily temperatures on the frequency statistics of record-temperature events and the magnitude of successive record temperatures. This topic will be discussed in detail in Sec. VII. We used the Fourier filtering analysis method [25, 26] to generate power-law correlations between daily temperature data for years consisting of $10^4$ days over 200 years and for several values of the exponent in the power law of the temporal correlation function.



V. RECORD TEMPERATURE DATA 

Between 1874 and 1999, a total of 1707 record highs 
(4.68 for each day on average) and 1343 record lows (3.68 
for each day) occurred in Philadelphia [27]. Because the
temperature was reported as an integer, a temperature 
equaling a current record could represent a new record 
if the measurement was more accurate. With the less 
stringent definition that a new record either exceeds or 
equals the current record, the number of record high and 
record low events over 126 years increased from 1707 to 
2126 and from 1343 to 1793, respectively. However, this 
alternative definition does not qualitatively change the 
statistical properties of record temperature events. 
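
The two definitions of a record can be made concrete with a short sketch. The function below is a hypothetical helper, not the code used to process the Philadelphia data; it counts record highs in a sequence of integer temperatures under either the strict definition or the less stringent one in which a tie also counts.

```python
def count_records(temps, allow_ties=False):
    """Count record highs after the first observation.
    With allow_ties=True, equalling the current record also counts."""
    current = temps[0]
    n_records = 0
    for t in temps[1:]:
        if t > current or (allow_ties and t == current):
            current = t
            n_records += 1
    return n_records

if __name__ == "__main__":
    temps = [70, 72, 72, 71, 75, 75, 75, 74]   # made-up integer readings
    print(count_records(temps), count_records(temps, allow_ties=True))
```

For the made-up sequence in the usage lines, the strict count is 2 and the tie-inclusive count is 5, which illustrates why integer rounding inflates the second count.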
FIG. 6: (Color online) Average $k$th record high (△) and record low (▽) temperature for each day, divided by the daily temperature dispersion, versus $k$ (from the Philadelphia temperature data). The dashed curve is $T_k/\sigma \propto \sqrt{k}$.

To compare with our theory, first consider the size of successive record temperatures. According to Eq. (29), the $k$th record high (and record low) temperature should be proportional to $\sqrt{2k\sigma^2}$. Because the mean temperature for each day has already been subtracted off, here $T_k$ denotes the absolute value of the difference between the $k$th record temperature and the zeroth record. To have a statistically meaningful quantity, we compute $T_k/\sigma_\alpha$ for each day of the year, and then average over the entire year; here the subscript $\alpha = h, l$ denotes the daily dispersion for the high and low temperatures, respectively. As shown in Fig. 6, the annual average for $T_k/\sigma_\alpha$ is consistent with $\sqrt{k}$ growth for both the record high and record low temperature. Up to the 6th record, both data sets are quite close, and where the data begin to diverge, the number of days with more than 6 records is small: 69 for high temperatures and 26 for low temperatures.

FIG. 7: Probability that a record high temperature (top) or record low (bottom) occurs at a time $t$ (in years) after the start of observations. The symbols △ and ▽ are 10-point averages of Philadelphia data from 1874 to 1999 for ease of visualization. Simulated data were produced by a stationary Gaussian distribution ($v = 0$), or where the mean increases according to $v = 0.003$, 0.006, or 0.012°C year$^{-1}$. The stationary data fit the theoretical expectation of $1/(t+1)$ (thick dashed line), while warming leads the distribution to asymptote to a constant probability (thin dashed lines).

FIG. 8: Probability that the $k$th record high temperature occurs at time $t$ (in years) or later, using simulated data (solid curves). The $k = 1$ simulated data closely match the asymptotic theoretical distribution of $1/t$ (dashed line). Also shown are the $k = 1$ data for record high temperatures (△) and record low temperatures (▽) for the Philadelphia data.

Finally, we study the evolution of the frequency of record temperature days as a function of time. As discussed in Sec. II, the number of records in the $t$th year of observation (since 1874) should be $365/t$. In spite of the year-to-year fluctuations in the number of records, the prediction $365/t$ fits the overall trend (Fig. 7). We also examine the distribution of waiting times between records. Since the amount of data is small, it is useful to study the cumulative distribution, $\mathcal{Q}_n(k) = \sum_{m=n}^{\infty} Q_m(k)$, defined as the probability that the time between the $k$th and the $(k+1)$st record temperatures on a given day is $n$ years or larger. As shown in Fig. 8, the agreement between the Philadelphia data and the theoretical prediction $\mathcal{Q}_n(0) \propto 1/n$ is quite good. The Monte Carlo simulations match the theoretical prediction nearly exactly, with an rms error of $9 \times 10^{-5}$.

In summary, the data for the magnitude of temperature jumps at each successive record, the frequency of record events, and the distribution of times between records are consistent with the theoretical predictions that arise from a Gaussian daily temperature distribution with a stationary mean temperature.

VI. SYSTEMATICALLY CHANGING TEMPERATURE

We now study how a systematically changing average temperature affects the evolution of record temperature events. For global warming, we assume that the mean temperature has a slow superimposed time dependence $vt$, with $v > 0$ and where $t$ is the time (in years) after the initial observational year.



A. Exponential distribution 

Again, as a warm-up exercise, we first consider the idealized case of an exponential daily temperature distribution,

$$ p(T;t) = \begin{cases} e^{-(T - vt)}, & T > vt, \\ 0, & T < vt, \end{cases} \tag{36} $$

where we set the characteristic temperature scale $\bar{T}$ to 1 for simplicity. In these units, both $T$ and $vt$ are dimensionless. With this distribution, the recursion Eq. (3) for successive record temperatures becomes

$$ T_{k+1} = \frac{\int_{T_k}^{\infty} T\, e^{-(T - v t_{k+1})}\, dT}{\int_{T_k}^{\infty} e^{-(T - v t_{k+1})}\, dT} = T_k + 1. \tag{37} $$

The factor $e^{v t_{k+1}}$ appears in both the numerator and denominator and thus cancels. As a result, $T_k = k+1$, independent of $v$. Thus a systematic temperature variation, either global warming or global cooling, does not affect the magnitude of the jumps in successive record high temperatures. This fact was verified by numerical simulations with an exponential distribution, where the distributions of $\mathcal{P}_k(T)$ for $v = 0.012$°C year$^{-1}$ and $v = 0$ match to within a few percent for $k = 0 \ldots 5$.
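
The cancellation in Eq. (37) can also be confirmed by direct numerical integration. The sketch below evaluates the ratio of integrals with a midpoint rule for arbitrary drift; the function name, the truncation `upper`, and the step count are illustrative choices, and the calculation assumes $T_k \ge v\,t_{k+1}$ so that the lower limit lies inside the support of the distribution.

```python
import math

def next_record_mean(t_k, v, t_next, upper=50.0, steps=100000):
    """Ratio of integrals in Eq. (37) by the midpoint rule on [T_k, T_k + upper].
    Assumes T_k >= v * t_next, so the integrand is the shifted exponential."""
    h = upper / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        temp = t_k + (i + 0.5) * h
        weight = math.exp(-(temp - v * t_next))
        num += temp * weight
        den += weight
    return num / den

if __name__ == "__main__":
    for v in (-0.012, 0.0, 0.012):
        print(v, next_record_mean(3.0, v, 10.0))
```

Each value of `v` returns $T_k + 1 = 4$ to within the integration error.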

On the other hand, a systematic temperature dependence does affect the time between records. Suppose that the current record high temperature of $T_k$ was set in year $t_k$. Then the exceedance probability at time $t_k + j$ is

$$ p_>(T_k; t_k + j) = \int_{T_k}^{\infty} e^{-[T - v(t_k + j)]}\, dT = e^{-(T_k - v t_k)}\, e^{jv} \equiv X\, e^{jv}. \tag{38} $$

The exceedance probability is thus either enhanced or suppressed by a factor $e^{v}$ due to global warming or cooling, respectively, for each elapsed year. The probability $q_n(T_k)$ that a new record high temperature occurs $n$ years after the previous record $T_k$ at time $t_k$ is

$$ q_n(T_k) = \left[\prod_{j=1}^{n-1} \left(1 - e^{jv} X\right)\right] e^{nv} X, \tag{39} $$

with $q_1(T_k) = e^{v} X$; this generalizes Eq. (5) to incorporate a global climatic change.

For the case of global warming ($v > 0$), each successive term in the product decreases in magnitude and there is a value of $j$ for which the factor $(1 - e^{jv} X)$ is no longer positive. At this point, the next temperature must be a new record. Thus we (over)estimate the time until the next record after $T_k$ by the criterion $(1 - e^{jv} X) = 0$, or $j = (T_k - v t_k)/v \sim (k/v) - t_k$. Since this value of $j$ also coincides with $t_{k+1} - t_k$ by construction, we obtain $t_k \sim k/v$. Thus the time between consecutive records asymptotically varies as $t_{k+1} - t_k \sim 1/v$. This conclusion agrees with a previous mathematical proof of the constancy of the rate of new records when a linear temporal trend is superimposed on a set of continuous iid variables [28]; a different approach to deal with a linear trend is given in [29].
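
The waiting-time distribution of Eq. (39) is easy to tabulate numerically, which makes the overestimate explicit. In the sketch below (hypothetical function names), each yearly exceedance probability $X e^{jv}$ is capped at 1, so the product terminates at the year in which a record becomes certain; that year is the bound $j = -\ln X / v = (T_k - v t_k)/v$ used above.

```python
import math

def waiting_time_distribution(x, v, n_max=100000):
    """q_n of Eq. (39) for n = 1, 2, ...; x = exp(-(T_k - v t_k)).
    Yearly exceedance probabilities are capped at 1."""
    q = []
    survive = 1.0                      # probability of no record so far
    for n in range(1, n_max + 1):
        p_n = min(1.0, x * math.exp(n * v))
        q.append(survive * p_n)
        survive *= 1.0 - p_n
        if survive == 0.0:
            break
    return q

def mean_wait(q):
    """Mean waiting time, sum over n of n * q_n."""
    return sum(n * qn for n, qn in enumerate(q, start=1))

if __name__ == "__main__":
    x, v = math.exp(-5.0), 0.05        # illustrative values only
    q = waiting_time_distribution(x, v)
    print(mean_wait(q), -math.log(x) / v)
```

For the illustrative values in the usage lines, the mean waiting time is well below the bound of 100 years, as expected for an overestimate; for $v = 0$ the distribution reduces to the geometric form with mean $1/X$.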

If global warming is slow, the waiting time between records will initially increase exponentially with $k$, as in the case of a stationary temperature, but then there will be a crossover to the asymptotic regime where the waiting time is constant. We estimate the crossover time by equating the two forms for the waiting times, $t_{k+1} - t_k = e^{(k+1)}$ (stationary temperature) and $t_{k+1} - t_k = 1/v$ (increasing temperature), to give $k^* \approx -\ln v$. Now the average annual high temperature in Philadelphia has increased by approximately 1.94°C over 126 years. The resulting warming rate of 0.0154°C per year then gives $k^* \approx 3.6$. Thus the statistics of the first 3.6 record high temperatures should be indistinguishable from those in a stationary climate, after which record temperatures should occur at a constant rate. Since the average number of record high temperatures for a given day is 4.7 and the time until the next record high is very roughly $e^{5.7} - e^{4.7} \approx 190$ years, we are still far from the point where global warming could have an unambiguous effect on the frequency of record high temperatures.
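
The arithmetic behind these estimates can be reproduced in a few lines. One caution: evaluating $-\ln v$ with $v = 0.0154$ gives about 4.2, whereas the quoted $k^* \approx 3.6$ follows if the rate is expressed in °F per year ($0.0154 \times 1.8 \approx 0.0277$). That unit conversion is an assumption made for this sketch and is not stated in the text.

```python
import math

rate_c = 1.94 / 126                    # deg C per year, from the text (~0.0154)
rate_f = rate_c * 1.8                  # ASSUMPTION: rate re-expressed in deg F per year
k_star = -math.log(rate_f)             # crossover index k* ~ -ln v
gap = math.exp(5.7) - math.exp(4.7)    # stationary estimate of time to the next record

if __name__ == "__main__":
    print(round(rate_c, 4), round(k_star, 2), round(gap))
```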

For global cooling ($v < 0$), the waiting time probability becomes

$$ q_n(T_k) = \left[\prod_{j=1}^{n-1} \left(1 - e^{-jw}\, Y\right)\right] e^{-nw}\, Y, \tag{40} $$

with $q_1(T_k) = e^{-w} Y$, where $w = |v|$ is positive, and $Y = e^{-T_k - w t_k}$. We estimate the above product by the following simple approach. When $jw < 1$, then $e^{-jw} \approx 1$, and each factor within the product is approximately $(1 - Y)$. Consequently, for $nw > 1$, each term in the product approximately equals $(1 - Y)$ for $j < n^* = 1/w$, while for $j > n^*$, $e^{-jw} \approx 0$, and the later terms in the product are all equal to 1. Thus

$$ q_n(T_k) \approx \begin{cases} (1 - Y)^{n}\, e^{-nw}\, Y, & n < n^*, \\ (1 - Y)^{1/w}\, e^{-nw}\, Y, & n > n^*. \end{cases} \tag{41} $$
Using this form for $q_n$, we find, after straightforward but slightly tedious algebra, that the dominant contribution to the waiting time until the next record temperature, $t_{k+1} - t_k = \sum_{n=1}^{\infty} n\, q_n$, comes from the terms with $n < n^*$ in the sum. For the case of slow global cooling, we thereby find

$$ t_{k+1} - t_k \approx \frac{1/Y}{\left[1 + w\,(1/Y - 1)\right]^{2}} \sim \frac{1}{Y} = e^{T_k + w t_k}. \tag{42} $$


Since $t_{k+1} - t_k \approx dt/dk$ and using $T_k \sim k$, Eq. (42) can be integrated to give $(1 - e^{-w t_k}) = w(e^{k} - 1)$. As long as the right-hand side is less than 1, a solution for $t_k$ exists. In the converse case, there is no solution and thus no additional record highs under global cooling, or equivalently, no more record lows for global warming. For small $w$ and in the pre-crossover regime where $e^{k} \approx t_k$, the criterion for no more records reduces to $t > 1/w$. If the daily low temperature in Philadelphia also experienced a warming rate of 0.0154°C per year, then there should be no additional record low temperatures after about 36 years of observations. However, the daily low temperatures do not show a long-term systematic variation, so new record lows should continue to occur, as is observed.
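
The integrated relation $(1 - e^{-w t_k}) = w(e^{k} - 1)$ can be solved in closed form for $t_k$, which makes the cutoff explicit. The helper below is a hypothetical sketch; it returns `None` when the right-hand side reaches 1, that is, when no further record occurs.

```python
import math

def record_time_under_cooling(k, w):
    """Solve 1 - exp(-w t_k) = w (e^k - 1) for t_k; None if no solution exists."""
    rhs = w * (math.exp(k) - 1.0)
    if rhs >= 1.0:
        return None
    return -math.log1p(-rhs) / w

if __name__ == "__main__":
    for k in range(1, 6):
        print(k, record_time_under_cooling(k, 0.0277))   # illustrative rate
```

For very small `w` the solution reduces to the stationary growth $t_k \approx e^{k} - 1$, while for the illustrative rate in the usage lines the solution ceases to exist at $k = 4$.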
B. Gaussian distribution

We now treat the more realistic case where a systematic temperature variation is superimposed on a Gaussian daily temperature distribution, as embodied by

$$ p(T;t) = \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-(T - vt)^2/2\sigma^2}. \tag{43} $$

The details of the effects of a systematic temperature variation on the statistics of record temperatures are tedious and we merely summarize the main results. We assume a slow systematic variation, $T - vt \gg 0$, so that an asymptotic analysis will be valid. Under this approximation, both global warming and global cooling lead to the following recursion for $T_k$, to leading order:

$$ T_{k+1} - T_k \sim \frac{\sigma^2}{T_k}\left(1 + \frac{vt}{T_k}\right). \tag{44} $$

The term proportional to $vt$ in Eq. (44) is subdominant, so that $T_k$ still scales as $\sqrt{2k\sigma^2}$, for both global warming and global cooling.

Next we determine the times between successive record high temperatures. The basic quantity that underlies these waiting times is again the exceedance probability, when the current record is $T_k$ and the current time is $t_k + j$. Following Eq. (24), this exceedance probability is

$$ p_>(T_k; t_k + j) = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{T_k - v(t_k + j)}{\sqrt{2\sigma^2}}\right). \tag{45} $$

In the asymptotic limit where the argument of the complementary error function is large, the controlling factor in $p_>$ is

$$ e^{-[T_k - v(t_k + j)]^2/2\sigma^2} \sim e^{-(T_k - v t_k)^2/2\sigma^2}\; e^{vj(T_k - v t_k)/\sigma^2}. \tag{46} $$

The crucial point is that the latter form for the exceedance probability has the same $j$ dependence as in the exponential distribution (Eq. (38)). Thus our arguments for the role of global warming with an exponential daily temperature distribution continue to apply. In particular, the time between successive records initially grows as $\sqrt{4\pi k}\; e^{k}$, but then asymptotically approaches the constant value $1/v$. As a result, the time before global warming measurably influences the frequency of record high and record low temperatures will be similar for both the exponential and Gaussian temperature distributions.
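
The quality of the factorization in Eq. (46) can be gauged numerically. The sketch below compares the exact growth of the exceedance probability of Eq. (45) over $j$ years with the controlling factor $e^{vj(T_k - v t_k)/\sigma^2}$. The function names and the parameter values in the usage lines (a record three dispersions above the mean, $\sigma = 5$) are illustrative assumptions.

```python
import math

def exceed_exact(gap, v, j, sigma):
    """Eq. (45): exceedance probability j years after the record was set,
    where gap = T_k - v t_k is the record's height above the mean at that time."""
    return 0.5 * math.erfc((gap - v * j) / math.sqrt(2.0 * sigma**2))

def growth_factor_approx(gap, v, j, sigma):
    """j-dependent controlling factor from Eq. (46)."""
    return math.exp(v * j * gap / sigma**2)

if __name__ == "__main__":
    gap, v, sigma = 15.0, 0.012, 5.0
    for j in (1, 10, 50):
        exact = exceed_exact(gap, v, j, sigma) / exceed_exact(gap, v, 0, sigma)
        print(j, exact, growth_factor_approx(gap, v, j, sigma))
```

The exact ratio slightly exceeds the exponential factor because the algebraic prefactor of the complementary error function also grows as the gap shrinks.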

Monte Carlo simulations were performed for warming rates $v = 0.003$, 0.006, and 0.012°C/year, where the middle case corresponds to the accepted rate of global mean warming of 0.6°C for the 20th century [|. Unlike the exponential distribution simulations, for the Gaussian distribution $\mathcal{P}_k(T)$ is slightly different in the cases of no warming and warming (Fig. 5).

Figure 7 shows the results of numerical simulations using the Gaussian distribution with $10^5$ realizations for the three warming rates. For the stationary case ($v = 0$), the probability of breaking a record after $t$ years closely follows the theoretical expectation of $1/(t+1)$. For warming, the rate of breaking a record high (Fig. 7, top) ultimately asymptotes to a constant frequency of approximately $1.25v$ by $10^4$ years. Given our crude calculation following Eq. (39) that the time between records is $1/v$, the agreement between the observed rate of $1.25v$ and our estimate of $v$ is gratifying. As also predicted in our theory, the probability of breaking a record low temperature under global warming precipitously decays after a few hundred years (Fig. 7, bottom); eventually record low temperatures simply stop occurring in a warming world.
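
The qualitative behavior in Fig. 7 can be reproduced with a small simulation. The sketch below is not the code used for the figure: it uses unit-variance Gaussian noise and a deliberately exaggerated drift of 0.1 dispersions per year (an assumption chosen so that the asymptotic regime is reached within a few hundred years), and it reports the average yearly probability of a record high and of a record low over the final 100 years.

```python
import random

def late_record_rates(drift, years=400, window=100, realizations=300, seed=2):
    """Average yearly probability of a record high and of a record low over the
    last `window` years, for unit-variance Gaussian noise about a mean drift*t."""
    rng = random.Random(seed)
    highs = 0
    lows = 0
    for _ in range(realizations):
        hi = lo = rng.gauss(0.0, 1.0)          # year 0 sets both zeroth records
        for t in range(1, years):
            temp = drift * t + rng.gauss(0.0, 1.0)
            if temp > hi:
                hi = temp
                highs += t >= years - window   # count only late-time records
            elif temp < lo:
                lo = temp
                lows += t >= years - window
    norm = realizations * window
    return highs / norm, lows / norm

if __name__ == "__main__":
    print("stationary:", late_record_rates(0.0))
    print("warming:   ", late_record_rates(0.1))
```

In the stationary case both rates are close to $1/(t+1)$ averaged over the window, while with drift the record-high rate settles at a much larger constant and record lows no longer occur.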



VII. ROLE OF TEMPORAL CORRELATIONS 

Thus far our presentation has been based on indepen- 
dent daily temperatures — no correlations between tem- 
peratures on successive days. However, from common ex- 
perience we know that local weather consists of multi-day 
patterns within which smaller temperature variations oc- 
cur. Anecdotally, the temperature tomorrow will be close 
to the temperature today. In fact, it has been found in 
global climatological data that correlations between temperatures on two widely separated days decay as a power law in the separation [30]. Here we quantify these correlations for the Philadelphia data and then discuss the
potential ramifications of these correlations on the fre- 
quency of record temperature events. 



A. Daily temperature correlation data 

From the Philadelphia data, we compute the normalized interday temperature correlation function defined as

$$ c_\alpha(i,j) = \langle T_i T_j \rangle - \langle T_i \rangle \langle T_j \rangle. \tag{47} $$

Here $i$ and $j > i$ denote the $i$th and $j$th days of the year, $T_i$ is the temperature on the $i$th day, and $\langle T_i \rangle$ is its average value over the 126 years of data, while the index $\alpha = h, m, l$ denotes the high, middle, and low temperature for each day. If $i$ is a day near the end of the year, then $T_j$ will refer to a temperature in the following year when the separation between the two days exceeds $(365 - i)$. According to Eq. (47), if the temperatures $T_i$ and $T_j$ are both greater than or both less than the respective average temperatures for days $i$ and $j$, then there is a positive contribution to the correlation function. Thus $c_\alpha(i,j)$ measures systematic temperature deviations from the mean on these two days. For convenience, we normalize the $c_\alpha$ so that they all equal 1 when $|i - j| = 0$.

The correlation functions depend primarily on the separation between the two days, $|i - j|$, and weakly on the initial day $i$. To obtain a succinct measure of the temperature correlation over a year, we define the annual average correlation function

$$ C_\alpha(t) = \sum_{i=1}^{365} c_\alpha(i, i+t). \tag{48} $$

FIG. 9: (Color online) The correlation functions $C_\alpha(t)$ for high (△), middle (○), and low temperature (▽) versus time (in years). The straight line of slope $-4/3$ is a guide for the eye.

All three correlation functions are consistent with a power law decay $C_\alpha(t) \sim t^{-\gamma}$ (Fig. 9). Over a range of approximately 1-20 days, the best fit value of $\gamma$ is 1.29 for $C_h$ (which remains strictly positive until 36 days) and $\gamma = 1.44$ for $C_m$ (which remains strictly positive until 41 days). The correlation function $C_l$ is visibly distinct and remains strictly positive until 149 days, with a best-fit exponent of $\gamma = 1.36$. These power-law decays in the temperature correlation functions are consistent with the previous results of Ref. [30]. However, the exponent value that we observe, approximately 4/3, is considerably larger than that reported in Ref. [30]. The time integrals of the high-, middle-, and low-temperature correlation functions are 1.78, 2.04 and 5.16 respectively. We may therefore view 1.78 as the average length of an independent high-temperature event, and correspondingly $365/1.78 \approx 205$ as the number of effective independent "days" for high temperatures. Parallel results hold for middle and low temperatures. These numbers provide a feeling for the extent of multiday weather patterns because of temperature correlations.
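
A simplified version of this analysis can be sketched for a single time series. The helper below computes a normalized sample autocorrelation with $C(0) = 1$; unlike Eqs. (47) and (48), it averages along one series rather than over 126 years for each pair of calendar days, so it is an illustration of the idea and not the procedure applied to the Philadelphia data. The second helper converts an integrated correlation time into a number of effectively independent days.

```python
def autocorrelation(series, max_lag):
    """Normalized sample autocorrelation C(t), t = 0..max_lag, with C(0) = 1."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev) / n
    corr = []
    for lag in range(max_lag + 1):
        pairs = n - lag
        c = sum(dev[i] * dev[i + lag] for i in range(pairs)) / pairs
        corr.append(c / c0)
    return corr

def effective_independent_days(correlation_time, days_per_year=365):
    """Number of effectively independent 'days' for a given correlation time."""
    return days_per_year / correlation_time

if __name__ == "__main__":
    for tau in (1.78, 2.04, 5.16):      # time integrals quoted in the text
        print(tau, round(effective_independent_days(tau)))
```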

B. Simulations with correlated daily temperatures 

To determine if these correlations affect the frequency and magnitude of record temperature events, we performed Monte Carlo simulations in which daily temperatures had temporal correlations that matched the data discussed above. We generate such correlated data using the Fourier filtering method of Refs. [25, 26] with a correlation function of the form

$$ C_\alpha(t) = t^{-\gamma} \tag{49} $$

for a range of $\gamma$ values around the observed value of 1.3-1.4. Due to the computational demands of generating correlated data, simulations of years consisting of $10^4$ days for 200 years were performed, which are less extensive than our simulations for uncorrelated temperatures. We find that the statistics of the time between record temperature events and the magnitude of successive record temperatures are virtually identical to those obtained when the temperature is an independent identically-distributed random variable (Figs. 10 and 11). Our results are also not sensitive to the value of the decay exponent $\gamma$ of the correlation function, within our tested range of $\gamma \in [0.5, 1.5]$. We conclude that the discussion in Secs. II-VI, which assumed uncorrelated day-to-day temperatures, can be applied to real atmospheric observations, where daily temperatures are correlated. It is worth mentioning, however, that interday correlations do strongly affect the statistics of successive extremes in temperatures [31].
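
The general idea of generating correlated Gaussian data by filtering in Fourier space can be sketched as follows. This is a simplified variant and should not be read as the procedure of Refs. [25, 26]: it assumes a regularized target correlation $(1 + t^2)^{-\gamma/2}$, which equals 1 at $t = 0$ and decays as $t^{-\gamma}$, imposes periodic boundaries, and uses a naive $O(N^2)$ transform so that it depends only on the standard library. All function names are hypothetical.

```python
import cmath
import math
import random

def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short series)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for q in range(n):
        total = 0j
        for m, val in enumerate(values):
            total += val * cmath.exp(sign * 2j * math.pi * q * m / n)
        out.append(total / n if inverse else total)
    return out

def correlated_series(n, gamma, rng):
    """Gaussian series whose circular correlation approximates (1 + t^2)^(-gamma/2)."""
    target = [(1.0 + min(t, n - t) ** 2) ** (-gamma / 2.0) for t in range(n)]
    spectrum = [max(s.real, 0.0) for s in dft(target)]       # clip rounding negatives
    noise_hat = dft([rng.gauss(0.0, 1.0) for _ in range(n)])  # white noise, Fourier space
    filtered = [math.sqrt(s) * eta for s, eta in zip(spectrum, noise_hat)]
    return [x.real for x in dft(filtered, inverse=True)]

if __name__ == "__main__":
    series = correlated_series(256, 1.3, random.Random(3))
    print(series[:5])
```

Multiplying the transformed white noise by the square root of the target spectrum gives a series whose ensemble-averaged correlation is the inverse transform of that spectrum, that is, the target function; a production implementation would use a fast Fourier transform.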




FIG. 10: Simulation data for the probability distribution of the $k$th record high temperature in degrees Celsius, $\mathcal{P}_k(T)$, where daily temperatures are uncorrelated (solid line) and power-law correlated with exponent 1.5 (dashed line).



C. Correlations between record temperature events 

While temperature correlations do not affect record 
statistics for a given day, these correlations should cause 
records to occur as part of a heat wave or a cold snap, 
rather than being singular one-day events. As a matter 
of curiosity, we studied the distribution of times (in days) 
between successive record events, as well as the distribu- 
tion of streaks (consecutive days) of record temperatures 
from the time history of all record temperature events. 

FIG. 11: Probability that the $k$th record high temperature occurs at time $t$ (in years) or later, using uncorrelated (solid line) and power-law correlated daily temperatures (dashed line).

Because the number of record temperatures decreases from year to year, these time and streak distributions are not stationary. We compensate for this non-stationarity by rescaling so that data for all years can be treated on the same footing. For example, for the distribution of times between successive records, we rescale each interevent time by the average time between records for that year. Thus, for example, if two successive records occurred 78 days apart in a year where 5 record temperature events occurred (average separation of 73 days), the scaled separation between these two events is $r = 78/73 \approx 1.068$. For the length of record streaks, we similarly rescaled each streak by the average streak length in that year, assuming record temperature events were uncorrelated.
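
The rescaling of interevent times can be written compactly. The function below is a hypothetical helper that takes the day-of-year of each record in a single year and divides each gap by that year's average spacing, 365 divided by the number of records; the day numbers in the usage lines are made up so that the first gap reproduces the 78-day example.

```python
def rescaled_intervals(record_days, days_per_year=365):
    """Gaps (in days) between successive records within one year, divided by
    that year's average spacing days_per_year / number_of_records."""
    mean_gap = days_per_year / len(record_days)
    days = sorted(record_days)
    return [(b - a) / mean_gap for a, b in zip(days, days[1:])]

if __name__ == "__main__":
    print(rescaled_intervals([10, 88, 120, 200, 300]))
```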
FIG. 12: (Color online) Distribution of times $p(r)$ between successive record temperature events (△ record highs, ▽ record lows). The times are scaled by the average time between record events for each year.

The distribution of times between successive record 
temperature days decays slower than exponentially 
(Fig. 12); the latter form would occur if record temperature events were uncorrelated. In a similar vein, we
observe an enhanced probability for records to occur in 
streaks. Since record streaks are rare, we can only make 
the qualitative statement that the streak distribution is 
different than that from uncorrelated data. Our basic 
conclusion is that interday temperature correlations do 
affect statistical features of successive record temperature 
events but do not affect the statistics of record temper- 
atures on a given day, where events are more than one 
year apart. 



VIII. DISCUSSION 

Two basic aspects of record temperature events are the size of the temperature jump when a new record occurs and the separation in years between successive records on a given day. We computed the distribution functions for these two properties by extreme statistics reasoning. For the Gaussian daily temperature distribution, we found that (i) the $k$th record high temperature asymptotically grows as $\sqrt{k}\,\sigma$, where $\sigma$ is the dispersion in the daily temperature, and (ii) record events become progressively less likely, with the typical time between the $k$th and $(k+1)$st record growing as $\sqrt{k}\,e^{k}$. This latter result is independent of $\sigma$ so that systematic changes in temperature variability should not affect the time between temperature records.

From these predictions, the distribution of waiting 
times between two successive records on a given day has 
an inverse-square power-law tail, with a divergent aver- 
age waiting time. Furthermore, the number of record 
events in the $t$th year of observations decays as $t^{-1}$
0, El M HI HI- These theoretical predictions agree 
with numerical simulations and with data from 126 years 
of observations in Philadelphia. Another important feature is that the annual frequency of record temperature events is not measurably influenced by interday power-law temperature correlations. However, these correlations do play a significant role at shorter time scales.

Our primary result is that we cannot yet distinguish 
between the effects of random fluctuations and long-term 
systematic trends on the frequency of record-breaking 
temperatures with 126 years of data. For example, in the 100th year of observation, there should be $365/100 = 3.65$ record-high temperature events in a stationary climate, while our simulations give 4.74 such events in a climate that is warming at a rate of 0.6°C per 100 years. However, the variation from year to year in the frequency of record events after 100 years is larger than the difference of $4.74 - 3.65$, which should be expected because of global warming (Fig. 7). After 200 years, this random
variation in the frequency of record events is still larger 
than the effect of global warming. On the other hand, 
global warming already does affect the frequency of ex- 
treme temperature events that are defined by exceeding 
a fixed threshold @, S H H @, . 
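
A rough estimate of the year-to-year spread supports this comparison. The sketch below treats the 365 calendar days as independent trials, each with record probability 1/100 in the 100th year. That independence is an assumption made for this estimate only; interday correlations would change the spread.

```python
import math

days = 365
p_stationary = 1.0 / 100                     # chance a given day sets a record in year 100
expected = days * p_stationary               # 3.65 records in the 100th year
# ASSUMPTION: the 365 calendar days are independent Bernoulli trials.
spread = math.sqrt(days * p_stationary * (1.0 - p_stationary))
warming_excess = 4.74 - expected             # shift attributed to warming in the text

if __name__ == "__main__":
    print(round(expected, 2), round(spread, 2), round(warming_excess, 2))
```

Under this assumption the standard deviation is about 1.9 records per year, larger than the shift of about 1.1 records attributed to warming.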

While the agreement between our theory and the data for record temperature statistics is satisfying, there are various facts that we have either glossed over or ignored. These include: (i) A significant difference between the number of record high and record low events: 1705 record high events and only 1346 record low events have occurred in the 126 years of data. (ii) A propensity for record high temperatures in the early spring. This seasonality is illustrated both by the number of records for each day of the year and by the daily temperature variance $\sigma_i = \sqrt{\langle T_i^2 \rangle - \langle T_i \rangle^2}$, where $\langle T_i \rangle$ and $\langle T_i^2 \rangle$ are the mean and mean-square temperatures for the $i$th day (Fig. 13). (iii) The potential role of a systematically increasing variability on the frequency of records. For the last point, Krug [32] has shown that for an exponential daily temperature distribution whose width is increasing linearly with time, the number of record events after $t$ years grows as $(\ln t)^2$, intermediate to the $\ln t$ growth of a stationary distribution and linear growth when the average temperature systematically increases. (iv) Day/night or high/low asymmetry [33]. That is, as a function of time there are more days whose highs exceed a given threshold and fewer days whose high is less than a threshold. Paradoxically, however, there are fewer days whose lows exceed a given temperature and more days whose lows are less than a given temperature. Since highs generally occur in daytime and lows in nighttime, these results can be restated as follows: the number of hot days is increasing and the number of cold nights is increasing. We don't know how this latter statement fits with the phenomenon of global warming.

FIG. 13: (Color online) Number of high-temperature records for each day of the year, averaged over a 30-day range (top). Below are the variances in the high (△), middle (○), and low (▽) temperatures for each day averaged over a 10-day range.

Another caveat is that our theory applies in the asymptotic limit, where each day has experienced a large number of record temperatures over the observational history. The fact that there are no more than 10 record events on any single day means that we are far from the regime where the asymptotic limit truly applies. Finally, and very importantly, it would be useful to obtain long-term temperature data from many stations to provide a more definitive test of our predictions.